**** Cavaille / Trump
*** JOP
*** Appendix for CFA
*** Model fitting 
*************************

set more off


***NB: all variables have been recoded so that 1 is the most "conservative" answer and 5 (or 10) the most "liberal" answer. 


*************************
*** Table 1. ************
*************************

**** Dataset : ess_2008_for_rep

****
*** Example below is with French data (see results and comments in the appendix) 
*****



clear all
use "/Users/XXX/ess_2008_for_rep.dta"

keep if cntry == "FR"

*** saturated model 

sem (<- sblazy sblwlka sblwcoa uentrjb bennent prtsick gincdif_n smdfslv_n dfincac gvjbevn gvslvue gvslvol gvhlthc gvpdlwk gvcldcr), stand 

estat gof, stats(all)

*** AIC for France is 58478

*** M1: unidimensional hypothesis

sem (UNI -> sblazy sblwlka sblwcoa uentrjb bennent prtsick gincdif_n smdfslv_n dfincac gvjbevn gvslvue gvslvol gvhlthc gvpdlwk gvcldcr), stand

estat gof, stats(all)

*** AIC is 62067

*** M2 : two dimensions 

sem (WELF -> sblazy sblwlka sblwcoa uentrjb bennent prtsick) (GOVEQUA ->  gincdif_n smdfslv_n dfincac gvjbevn gvslvue gvslvol gvhlthc gvpdlwk gvcldcr), stand

estat gof, stats(all)

*** AIC is 59808, sharp improvement

*** M3: three dimensions 

sem (WELF -> sblazy sblwlka sblwcoa uentrjb bennent prtsick) (EQUA ->  gincdif_n smdfslv_n dfincac) (GOV ->  gvjbevn gvslvue gvslvol gvhlthc gvpdlwk gvcldcr), stand

estat gof, stats(all)

*** AIC is 59365, the improvement is less striking than when moving from one to two dimensions. 
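
*** Optional sketch, not part of the original analysis: the AICs quoted above can be collected in a single table
*** by storing each fit and calling -estimates stats-. The names m1, m2 and m3 are illustrative only. The models are
*** refit quietly, with M3 last so that it remains the active model for the commands that follow.
*** The values should agree with those reported by -estat gof- above, but this has not been checked here.

quietly sem (UNI -> sblazy sblwlka sblwcoa uentrjb bennent prtsick gincdif_n smdfslv_n dfincac gvjbevn gvslvue gvslvol gvhlthc gvpdlwk gvcldcr), stand
estimates store m1

quietly sem (WELF -> sblazy sblwlka sblwcoa uentrjb bennent prtsick) (GOVEQUA ->  gincdif_n smdfslv_n dfincac gvjbevn gvslvue gvslvol gvhlthc gvpdlwk gvcldcr), stand
estimates store m2

quietly sem (WELF -> sblazy sblwlka sblwcoa uentrjb bennent prtsick) (EQUA ->  gincdif_n smdfslv_n dfincac) (GOV ->  gvjbevn gvslvue gvslvol gvhlthc gvpdlwk gvcldcr), stand
estimates store m3

*** one row per stored model, with log likelihood, degrees of freedom, AIC and BIC
estimates stats m1 m2 m3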

*** We are still far from the baseline (saturated-model) AIC. Step one is to look at the modification indices to identify the sources of misfit. Misfit can arise either
*** because an item is correlated with one of the other components/dimensions or because the item-specific errors of two items are correlated,
*** for instance because of similar wording.

estat mindices
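
*** Optional sketch, not part of the original analysis: to list only the largest modification indices,
*** -estat mindices- accepts a minchi2() threshold. The cutoff of 50 is an arbitrary illustrative value;
*** by default all indices significant at the 5% level are listed.

estat mindices, minchi2(50)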

*** We find five sources of misfit with very large modification indices. 1) sblwlka and sblwcoa share variance, most likely because of similar wording.
*** 2) gvpdlwk and gvcldcr, as well as 3) gvslvol and gvhlthc, also share variance. This can be expected, as support for government-provided
*** health care and old-age pensions is close to universal in countries like France. Support for new needs, such as
*** childcare or income loss due to one's caretaking responsibilities, is on the other hand less consensual (see Silja Hausermann's work on these issues).
*** In terms of error correlation, one might expect support for public pensions to be highly correlated with support for public provision of healthcare.
*** The other two policies might instead capture commitment to government income protection beyond the core constituted by pensions and healthcare,
*** that is, one's commitment to having the welfare state address income insecurity more generally.
*** An additional source of misfit is 4) a correlation between uentrjb, bennent and prtsick, which indicates a potential third component shaping answers to these three items.
*** The final source is 5) the possibility that non-anchor items that have been constrained to load on one dimension only (and thus behave like anchor items) also load on other latent dimensions.
*** gvslvue and dfincac are potential candidates: gvslvue asks about helping the "other" (unemployed workers), and dfincac taps into beliefs about
*** the role of inequality in incentives, which, Svallfors has argued, differ from attitudes toward egalitarian redistribution. Indeed, the modification indices are
*** high for these two items, with gvslvue and dfincac potentially loading on the WELF dimension.


sem (WELF1 -> sblazy sblwlka sblwcoa gvslvue dfincac) (WELF2 -> uentrjb bennent prtsick gvslvue dfincac) ///
(EQUA ->  gincdif_n smdfslv_n dfincac) (GOV ->  gvjbevn gvslvue gvslvol gvhlthc gvpdlwk gvcldcr), ///
stand cov(e.gvpdlwk*e.gvcldcr)  cov(e.gvslvol*e.gvhlthc) cov(e.sblwlka*e.sblwcoa)

estat gof, stats(all)
estat mindices


*** AIC is 58612, which is very close to the 58478 of the saturated model. 

**** We have found the best-fitting model; let's now consider its theoretical meaning by looking more closely at the loadings and their p-values.
*** First, there is no reason to include dfincac as loading on both WELF and EQUA, as the loading on WELF is substantively small
*** and barely significant. In addition, WELF1 and WELF2 are highly correlated (corr = 0.80), indicating that
*** constraining these items to load on the same dimension, while resulting in a lower goodness of fit, does not jeopardize interpretation of the data structure.
*** Allowing the items to load on the same dimension while letting the error terms of uentrjb, bennent and prtsick covary is a better theoretical fit.

*** We thus obtain the final model : 

sem (WELF -> sblazy sblwlka sblwcoa gvslvue uentrjb bennent prtsick) ///
(EQUA ->  gincdif_n smdfslv_n dfincac) (GOV ->  gvjbevn gvslvue gvslvol gvhlthc gvpdlwk gvcldcr), ///
stand cov(e.gvpdlwk*e.gvcldcr)  cov(e.gvslvol*e.gvhlthc) cov(e.sblwlka*e.sblwcoa) ///
cov(e.uentrjb*e.bennent) cov(e.prtsick*e.bennent) cov(e.uentrjb*e.prtsick) 

estat gof, stats(all)
estat mindices



*** AIC is equal to 58667, compared to 58478 in the saturated model and 58612 in the best-fitting model. In addition, all loadings are significant
*** and make theoretical sense. The same strategy was used for Germany, the UK and Sweden (SE), with similar results (the main difference being the size of the factor loading for
*** dfincac, though it was always much smaller than the item's loading on the EQUA dimension). We use this model and analyse the results in the main paper.




